FERMILAB-Pub-95/375-A
submitted to Astrophysical Journal 



Determining cosmic microwave background 
anisotropies in the presence of foregrounds 



Scott Dodelson 

NASA/Fermilab Astrophysics Center 
Fermi National Accelerator Laboratory, Batavia, IL 60510-0500 



ABSTRACT 

Separating foregrounds from the signal is one of the big challenges in cosmic microwave 
background (CMB) experiments. A simple way to estimate the CMB temperature in a given 
pixel is to fit for the amplitudes of the CMB and the various foreground components. The 
variance squared of this estimator is shown to be equal to $(\mathrm{FDF})^2 (\sigma_\theta^{(0)})^2 + \sigma_{\rm shape}^2$, where $\sigma_\theta^{(0)}$
is the variance in the absence of foregrounds; $\sigma_{\rm shape}$ is the variance due to the uncertainty
in the shapes of the foreground components; and FDF is the foreground degradation factor.
This one number, the FDF, gives a good indication of the ability of a given experiment
to disentangle the CMB from foreground sources. A variety of applications relating to the 
planning and analyzing of experiments is presented. 



1 Introduction 



The cosmic microwave background encodes a great deal of information about our universe. 
In particular the anisotropies - and especially those on small scales - are sensitive to many 
cosmological parameters and to the initial perturbations which grew into the large structures 
observed today. Thus a map of the anisotropies in the CMB on small scales can unequivocally
answer questions that have plagued cosmologists for decades (or longer). For this
reason, a number of groups have set out to make such maps of the sky at varying angular 
resolution, typically better than half a degree. 

There are many complicated experimental issues involved in making such maps. However, 
even an experiment perfectly designed to minimize atmospheric contamination, sidelobes, 
1/f noise, etc., still has to deal with the reality of the sky. And this reality includes
not only the "signal" in the form of CMB anisotropies but also "noise" in the form of 
galactic and extragalactic foregrounds. The most powerful way to extract the CMB signal 
from foreground contamination is to take measurements at many different frequencies. The 
CMB anisotropies vary with frequency differently than do the foregrounds. By using the 
knowledge we have about these different spectral shapes, we can conceivably extract the 
CMB component from the total signal. 

In this paper, I will discuss how to perform this extraction. For a given set of frequencies 
and given number of foregrounds one wants to eliminate, we can define an estimator, $\theta^{\rm cmb}$,
for the true CMB temperature by fitting the amplitudes of a CMB component and various
foreground components to the observed temperatures in each channel. On average this
estimator will equal the true CMB temperature $T^{\rm cmb}$. The variance of the estimator depends
on the instrumental and atmospheric noise of course. But it also depends on the frequency 
coverage and the foregrounds. In fact the variance can be simply expressed as

$$\sigma_\theta^2 = (\mathrm{FDF})^2 \left(\sigma_\theta^{(0)}\right)^2 + \sigma_{\rm shape}^2 \qquad (1)$$

where $\sigma_\theta^{(0)}$ is the variance in the absence of foregrounds (due to instrumental and atmospheric
noise) and $\sigma_{\rm shape}$ is the contribution to the variance due to the uncertainty in the spectral
shapes of the foregrounds. In many cases, $\sigma_{\rm shape}$ will be small, so the variance is enhanced
over the no-foreground case by the foreground degradation factor, FDF. By construction,
FDF is greater than or equal to one. Thus the effectiveness of any given set of frequencies
can be expressed by this one number. If the FDF for a given frequency set is large, then
the contaminating foreground is troubling; if FDF is close to one, then foregrounds may be
effectively eliminated.

Another feature of equation 1 deserves mention. The first term is proportional to the
instrumental and atmospheric noise; we will see that the second is proportional to the rms
amplitude of the foregrounds. While typically the first term dominates, there are situations
- e.g. in experiments with very low noise per pixel or experiments in dusty regions of the
sky - where the second term is most important. In these cases, the channels should be
constructed to minimize $\sigma_{\rm shape}$.

Section 2 presents a prescription for calculating FDF and $\sigma_{\rm shape}$ for a given experimental
configuration and set of foregrounds. Some of the details are relegated to Appendix A.
Section 3 presents a number of applications of this analytic technique: questions that
might come up in designing or analyzing an experiment and that can be simply addressed with
the concepts of FDF and $\sigma_{\rm shape}$.

This method of estimating the CMB temperature was carried out by the MSAM team 
in Cheng et al (1994) when analyzing their data. In several previous papers (Dodelson & 
Stebbins 1994 and Dodelson & Kosowsky 1995) my collaborators and I analyzed a variety of 
experiments using an apparently different technique, that of marginalization. In Appendix 
B, I show that the two techniques are in fact identical.

I should point out that there has already been a good deal of work on the issue of foregrounds.
Perhaps the most influential has been the paper by Brandt et al. (1994). Without
getting into the details of their work, I simply point out that their basic technique is the 
Monte Carlo. Here I am more interested in seeing what can be done analytically. In §3.4,
I compare this analytic approach with their Monte Carlo methods and find excellent agreement.
The work of Toffolatti et al (1994), Danese et al (1995), and Tegmark & Efstathiou
(1995) uses information from other maps, such as the IRAS (Neugebauer et al 1984) map of 
dust. Although the formalism discussed in this paper can probably be extended to include 
such maps, here I do not attempt to do so. So the conclusions reached here are probably 
on the conservative side (I assume that less is known about the foregrounds). The fact that 
these conclusions are still reasonably optimistic is encouraging and offers still more evidence 
that foregrounds will not be an intractable problem for a satellite mission. 

2 CMB Estimator and Variance 

This section is divided into three parts. First there is a brief discussion of notation; this provides
the information necessary to translate the experimental/foreground information into the
vectors used to calculate FDF and $\sigma_{\rm shape}$. The second subsection derives the estimator of the
CMB temperature and its variance. One simply performs a best fit to the free parameters:
the amplitudes of the various components. Calculating the variance of this estimator leads
immediately to the concept and definition of FDF and $\sigma_{\rm shape}$. Section 2.3 then presents a
simple formula for the FDF in the presence of one and two foregrounds.

2.1 Notation 

I will label the frequency channels with a subscript $a = 1, \ldots, N_{\rm ch}$. The observed
antenna temperature in each channel is denoted $T_a$. It will be convenient to group all $N_{\rm ch}$ of
these numbers into an $N_{\rm ch}$-dimensional vector $\vec T$. The observed signal is composed of the
CMB component, foreground components, and noise, so

$$\vec T = \sum_{i=0}^{N_{\rm fg}} \vec T^i + \vec N \qquad (2)$$

where the CMB component has superscript 0 and the $i = 1, \ldots, N_{\rm fg}$ foreground components
are appropriately superscripted; $\vec N$ denotes the contribution to the signal from instrumental
and atmospheric noise. The noise is assumed Gaussian with

$$\langle \vec N \rangle = 0 ; \qquad \langle N_a N_b \rangle = C_{ab}. \qquad (3)$$

Throughout, $\vec T^i$ will be used to refer to the true temperatures on the sky. These are to be
distinguished from the estimators, $\vec\Theta^i$, which represent our best guess about these temperatures.
These estimators will assume that the shapes of the foregrounds and CMB are known
and take the amplitudes as free parameters. Thus, we set

$$\vec\Theta^i = \vec F^i \theta^i \qquad (4)$$

where $\theta^i$ is the (unknown) amplitude of the $i$th component and $\vec F^i$ is the (presumed) shape
of that component. As a simple example consider the CMB component. We know that it
has a blackbody shape, after subtracting off the mean,

$$\hat F^0_a = \frac{x_a^2 e^{x_a}}{(e^{x_a} - 1)^2} \qquad (5)$$

where $x_a = 2\pi\hbar\nu_a/k_B T$ and $T = 2.726$ K, the average temperature. With this shape vector,
the amplitude $\theta^0$ is the estimate of the thermodynamic temperature anisotropy, which of
course is frequency independent. Note that in the Rayleigh-Jeans limit ($x_a \to 0$), $\hat F^0_a \to 1$.
The CMB shape vector has a hat over it to denote a unit vector. That is, $\hat F^0 \cdot \hat F^0 = 1$, where the
dot product of any two vectors is defined as

$$\vec T \cdot \vec S = \left(\sigma_\theta^{(0)}\right)^2 \sum_{a,b=1}^{N_{\rm ch}} T_a C^{-1}_{ab} S_b. \qquad (6)$$

The prefactor here, $(\sigma_\theta^{(0)})^2$, is the variance in the absence of foregrounds and can be written as

$$\left(\sigma_\theta^{(0)}\right)^2 = \frac{1}{\sum_{a,b=1}^{N_{\rm ch}} \hat F^0_a C^{-1}_{ab} \hat F^0_b}. \qquad (7)$$

We will see shortly that this is indeed the variance in the no-foreground case, but one can
immediately see that this is reasonable by considering the case of equal and uncorrelated
noise with variance $\sigma$ in each channel in an experiment with frequencies in the Rayleigh-Jeans
limit. Then $\sigma_\theta^{(0)} \to \sigma/\sqrt{N_{\rm ch}}$, the correct limit. Finally, it will prove useful to introduce
the $(N_{\rm fg} + 1) \times (N_{\rm fg} + 1)$ matrix

$$K_{ij} = \vec F^i \cdot \vec F^j. \qquad (8)$$

2.2 Best-Fit Estimator and its variance



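To determine the amplitudes of the various components, we can minimize the variance:

$$\frac{\partial}{\partial\theta^i} \left( \Big| \vec T - \sum_{j=0}^{N_{\rm fg}} \vec F^j \theta^j \Big|^2 \right) = 0. \qquad (9)$$

Appendix A provides the straightforward details of this minimization. The result is that the
estimator for the CMB temperature is

$$\theta^0 = \sum_{j=0}^{N_{\rm fg}} K^{-1}_{0j}\, \vec F^j \cdot \vec T. \qquad (10)$$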



It is important to note that the estimator in equation 10 is linear in the observed temperature
$\vec T$. Therefore, if the noise around $\vec T$ is Gaussian, then the noise around $\theta$ will also be Gaussian.
Had we allowed the foreground shapes to vary as well, the transformation would no longer 
be linear, and there would be no reason to expect the noise to be Gaussian. 
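
As a rough illustration of equations 5 through 10, the following Python sketch builds the shape vectors, forms the matrix K, and evaluates the estimator for a noise-free toy sky. The channel frequencies, the amplitudes, the diagonal noise matrix, and the helper names (`cmb_shape`, `power_law_shape`, `estimate_cmb`) are all assumptions made for this example and do not come from the paper.

```python
import numpy as np

H_OVER_K = 0.0479924  # h / k_B in kelvin per GHz
T_CMB = 2.726         # mean CMB temperature in kelvin, as quoted in the text

def cmb_shape(nu_ghz):
    """CMB anisotropy shape in antenna temperature (eq. 5)."""
    x = H_OVER_K * np.asarray(nu_ghz, dtype=float) / T_CMB
    return x**2 * np.exp(x) / np.expm1(x)**2

def power_law_shape(nu_ghz, p):
    """Assumed power-law foreground shape, set to 1 at the lowest channel."""
    nu = np.asarray(nu_ghz, dtype=float)
    return (nu / nu.min())**p

def estimate_cmb(t_obs, shapes, cov):
    """Return (theta0, FDF, sigma0); shapes[0] is the CMB, the rest foregrounds."""
    F = np.asarray(shapes, dtype=float)                   # shape (N_fg + 1, N_ch)
    M = F @ np.linalg.solve(cov, F.T)                     # F C^-1 F^T
    sigma0_sq = 1.0 / M[0, 0]                             # eq. 7
    K_inv = np.linalg.inv(sigma0_sq * M)                  # inverse of K (eq. 8)
    dots = sigma0_sq * (F @ np.linalg.solve(cov, t_obs))  # F^j . T (eq. 6)
    theta0 = K_inv[0] @ dots                              # eq. 10
    return theta0, np.sqrt(K_inv[0, 0]), np.sqrt(sigma0_sq)

# Toy sky: 50 uK of CMB plus 40 uK (at 30 GHz) of synchrotron, no noise.
nu = [30.0, 40.0, 50.0]
shapes = [cmb_shape(nu), power_law_shape(nu, -2.9)]
cov = np.eye(3)                                           # equal, uncorrelated noise
t_obs = 50.0 * shapes[0] + 40.0 * shapes[1]
theta0, fdf, sigma0 = estimate_cmb(t_obs, shapes, cov)
print(theta0, fdf, sigma0)
```

Because the toy sky has no noise and its foreground follows the assumed shape, `theta0` should return the input CMB amplitude to rounding error. With real data the same call returns the estimate, and `fdf * sigma0` is the noise contribution to its uncertainty.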

Equation 10 is an estimate for the thermodynamic temperature anisotropy of the CMB.
How good an estimator is it? To answer this question, we need to compute

$$\sigma_\theta^2 \equiv \left\langle (\theta^0 - t^0)^2 \right\rangle \qquad (11)$$

where $t^0$ is the true CMB thermodynamic temperature on the sky. A short calculation
(presented in Appendix A) shows that this variance is given by equation 1 with

$$\mathrm{FDF} = \sqrt{K^{-1}_{00}} \qquad (12)$$

and

$$\sigma_{\rm shape}^2 \equiv \left\langle \left( \sum_{i=1}^{N_{\rm fg}} \vec T^i \cdot \sum_{j=0}^{N_{\rm fg}} K^{-1}_{0j} \vec F^j \right)^2 \right\rangle. \qquad (13)$$

One important limit of equation 1 is when no foregrounds are projected out, $N_{\rm fg} = 0$. In
that case, the matrix K has only one component, the 00 component, which is unity. Thus
FDF = 1 and the variance is equal to $\sigma_\theta^{(0)}$ defined in equation 7. Another important
limit is when the foreground shapes are known. In this limit, $\sigma_{\rm shape}$ vanishes. To see this,
note that if we have chosen the correct $\vec F^i$ for the foregrounds, then the true foregrounds
$\vec T^i$ are proportional to $\vec F^i$. Then, the dot product $\vec T^i \cdot \vec F^j$ in equation 13 is proportional to $K_{ij}$.
When multiplied by $K^{-1}_{0j}$ and summed over $j$, this gives a delta function, $\delta_{0i}$, which vanishes
for all foregrounds i > 0.
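
The split of the variance into an FDF-enhanced noise term and a shape term can be checked numerically. The sketch below is only an illustration under assumed inputs (three channels in the Rayleigh-Jeans limit and a made-up diagonal noise matrix). It writes the estimator as a set of weights `w` with $\theta^0 = w \cdot \vec T$ and verifies that the noise variance equals $(\mathrm{FDF}\,\sigma_\theta^{(0)})^2$ and that a foreground with the assumed shape is projected out exactly.

```python
import numpy as np

def estimator_weights(shapes, cov):
    """Weights w with theta0 = w @ T, plus FDF and sigma0; shapes[0] is the CMB."""
    F = np.asarray(shapes, dtype=float)
    cinv_ft = np.linalg.solve(cov, F.T)        # C^-1 F^T, shape (N_ch, N_fg + 1)
    M = F @ cinv_ft
    sigma0_sq = 1.0 / M[0, 0]                  # eq. 7
    K_inv = np.linalg.inv(sigma0_sq * M)       # inverse of K (eq. 8)
    w = sigma0_sq * (cinv_ft @ K_inv[0])       # eq. 10 rewritten as weights
    return w, np.sqrt(K_inv[0, 0]), np.sqrt(sigma0_sq)

# Illustrative three-channel setup in the Rayleigh-Jeans limit (CMB shape = 1).
nu = np.array([30.0, 40.0, 50.0])
cmb = np.ones(3)
sync = (nu / 30.0)**-2.9                       # assumed foreground shape
cov = np.diag([1.0, 1.5, 2.0])                 # hypothetical unequal noise variances

w, fdf, sigma0 = estimator_weights([cmb, sync], cov)
noise_var = w @ cov @ w                        # variance of the noise term w @ N
leak_cmb = w @ cmb                             # response to a unit CMB signal
leak_right = w @ sync                          # foreground with the assumed shape
leak_wrong = w @ (nu / 30.0)**-2.1             # foreground with a different shape
print(noise_var, (fdf * sigma0)**2, leak_cmb, leak_right, leak_wrong)
```

Here `leak_wrong` is the quantity inside the bracket of equation 13 for a unit-amplitude foreground whose true shape differs from the assumed one.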

2.3 FDF in the presence of one or two foregrounds 



The FDF can be easily calculated via equation 12 once the matrix K is known. K in turn
depends on the assumed foreground shapes via equation 8. Here I present the results for
FDF in the cases where (i) one foreground is to be projected out and (ii) two foregrounds
are to be removed.

If there is one foreground to be removed with shape vector $\vec F^1$, then the matrix K depends
on only the dot products $\hat F^0 \cdot \vec F^1$ and $\vec F^1 \cdot \vec F^1$:

$$K = \begin{pmatrix} 1 & \hat F^0 \cdot \vec F^1 \\ \hat F^0 \cdot \vec F^1 & \vec F^1 \cdot \vec F^1 \end{pmatrix}. \qquad (14)$$

The inverse of this K is readily obtained:

$$K^{-1} = \frac{1}{\vec F^1 \cdot \vec F^1 - (\hat F^0 \cdot \vec F^1)^2} \begin{pmatrix} \vec F^1 \cdot \vec F^1 & -\hat F^0 \cdot \vec F^1 \\ -\hat F^0 \cdot \vec F^1 & 1 \end{pmatrix} \qquad (15)$$

so that

$$\mathrm{FDF} = \left[ \frac{1}{1 - (\hat F^0 \cdot \vec F^1)^2 / \vec F^1 \cdot \vec F^1} \right]^{1/2}. \qquad (16)$$

The limits of equation 16 are interesting. If the foreground component has a much different
spectrum than the CMB, then their shape vectors will be much different, and $\hat F^0 \cdot \vec F^1 \simeq 0$.
In this case, FDF goes to one. That is, a foreground component with a shape much different
than that of the CMB does not degrade the sensitivity of the experiment. On the other hand
a foreground component with a shape very close to that of the CMB (so that $(\hat F^0 \cdot \vec F^1)^2 \simeq
\vec F^1 \cdot \vec F^1$) produces a very large FDF. To minimize the FDF in a given experiment then,
one must measure at frequencies designed to maximize the "angle" between the foreground
spectrum and the CMB spectrum.
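
The closed form for one foreground can be compared against a direct numerical inversion of K. In the sketch below the channel set, the unit noise matrix, the Rayleigh-Jeans CMB shape, and the two foreground indices are illustrative assumptions; the function names are hypothetical.

```python
import numpy as np

def fdf_closed_form(cmb, fg, cov):
    """Eq. 16, written via the cosine of the angle between the two shapes."""
    def dot(a, b):
        # Eq. 6 without the sigma0^2 prefactor, which cancels in the ratio below.
        return a @ np.linalg.solve(cov, b)
    cos_sq = dot(cmb, fg)**2 / (dot(cmb, cmb) * dot(fg, fg))
    return 1.0 / np.sqrt(1.0 - cos_sq)

def fdf_from_matrix(cmb, fg, cov):
    """Eq. 12: build the 2x2 matrix K of eq. 14 and invert it numerically."""
    F = np.array([cmb, fg])
    M = F @ np.linalg.solve(cov, F.T)
    K = M / M[0, 0]                       # multiply by sigma0^2 = 1 / M[0, 0]
    return np.sqrt(np.linalg.inv(K)[0, 0])

nu = np.array([30.0, 40.0, 50.0])         # GHz, illustrative channels
cmb = np.ones(3)                          # Rayleigh-Jeans limit of the CMB shape
cov = np.eye(3)
steep = (nu / 30.0)**-2.9                 # spectrum very different from the CMB
flat = (nu / 30.0)**-0.05                 # spectrum nearly parallel to the CMB
for fg in (steep, flat):
    print(fdf_closed_form(cmb, fg, cov), fdf_from_matrix(cmb, fg, cov))
```

The steep spectrum gives a modest FDF, while the nearly flat one, being almost parallel to the CMB shape, drives the FDF up sharply.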

In the case when there are two foreground sources to project out, we define the three
angles:

$$\cos\phi_1 \equiv \hat F^{\rm cmb} \cdot \hat F^1 ; \qquad \cos\phi_2 \equiv \hat F^{\rm cmb} \cdot \hat F^2 ; \qquad \cos\phi_{12} \equiv \hat F^1 \cdot \hat F^2. \qquad (17)$$

Then, moving through the same steps as in the one foreground case (but this time with the
aid of Mathematica), one finds that

$$\mathrm{FDF} = \left[ \frac{\sin^2\phi_{12}}{\sin^2\phi_{12} - \cos^2\phi_1 - \cos^2\phi_2 + 2\cos\phi_1 \cos\phi_2 \cos\phi_{12}} \right]^{1/2}. \qquad (18)$$

Note again that in the limit that one of the foregrounds is parallel to the CMB ($\cos\phi_1 = 1$
or $\cos\phi_2 = 1$), the FDF blows up as is expected.
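
Equation 18 can likewise be checked against a numerical inverse of the 3x3 matrix K. This is a minimal sketch under assumed inputs: four illustrative channels, unit noise, a Rayleigh-Jeans CMB shape, and power-law foregrounds with the indices used later in the text. The helper names are hypothetical.

```python
import numpy as np

def cosine(a, b, cov):
    """Cosine of the angle between two shape vectors under the eq. 6 metric."""
    dot = lambda u, v: u @ np.linalg.solve(cov, v)
    return dot(a, b) / np.sqrt(dot(a, a) * dot(b, b))

def fdf_two_foregrounds(cmb, fg1, fg2, cov):
    """Eq. 18 in terms of the three angles of eq. 17."""
    c1 = cosine(cmb, fg1, cov)
    c2 = cosine(cmb, fg2, cov)
    c12 = cosine(fg1, fg2, cov)
    sin12_sq = 1.0 - c12**2
    return np.sqrt(sin12_sq / (sin12_sq - c1**2 - c2**2 + 2.0 * c1 * c2 * c12))

def fdf_from_matrix(shapes, cov):
    """Eq. 12 with a numerical inverse of K; shapes[0] is the CMB."""
    F = np.asarray(shapes, dtype=float)
    M = F @ np.linalg.solve(cov, F.T)
    return np.sqrt(np.linalg.inv(M / M[0, 0])[0, 0])

nu = np.array([20.0, 30.0, 40.0, 50.0])   # GHz, illustrative channels
cmb = np.ones(4)                          # Rayleigh-Jeans CMB shape
sync = (nu / 20.0)**-2.9                  # assumed synchrotron index
brem = (nu / 20.0)**-2.1                  # assumed bremsstrahlung index
cov = np.eye(4)
analytic = fdf_two_foregrounds(cmb, sync, brem, cov)
numeric = fdf_from_matrix([cmb, sync, brem], cov)
print(analytic, numeric)
```

The FDF does not depend on how the foreground shape vectors are normalized, which is why the angle form and the matrix form can be compared directly.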



3 Applications 

I now apply the formalism of the previous section to several practical questions. To set the
stage, consider figure 1 which shows the spectra of the three galactic foregrounds of interest
to us: synchrotron, bremsstrahlung, and dust. The shape of the bremsstrahlung spectrum
is pretty well fixed by atomic physics. If we parametrize a given shape by

$$F_a \propto \nu_a^{\,p} \qquad (19)$$

then $p_{\rm brem} = -2.1$ with an uncertainty of a few percent. The spectral index of synchrotron
is much more uncertain; typical estimates suggest that $p_{\rm sync} = -2.9 \pm 0.2$. Finally the
uncertainty in the spectral index of dust is even more pronounced; in fact it is not even
clear if a fit along the lines of equation 19 is adequate to represent the complexities of dust.
Nonetheless, a rough estimate might give $p_{\rm dust} = 1.5 \pm 0.5$. Figure 1 illustrates these different
shapes. Thus dust is expected to dominate at high frequencies and the other components at
low frequencies.

Figure 1: Unnormalized shapes of the different components in the sky.



3.1 One Component: Synchrotron 

Let us start with the simplest possible example: one foreground component, synchrotron, 
with spectral index assumed known. This example, while crude, is not really that unrealistic. 
At low frequencies dust can be safely ignored, and bremsstrahlung typically comes in lower
than synchrotron. Further, as we will see in the next subsection, the uncertainty in the 
spectral index introduces very little error. 

According to equation 1, the uncertainty in our determination of the CMB temperature
has only one piece if the spectral shape of the foreground is known:

$$\sigma_\theta^2 = \mathrm{FDF}^2 \left(\sigma_\theta^{(0)}\right)^2 \to \mathrm{FDF}^2\, \frac{\sigma^2}{N_{\rm ch}} \qquad (20)$$

where the limit $(\sigma_\theta^{(0)})^2 \to \sigma^2/N_{\rm ch}$ holds in the case of equal and uncorrelated noise in each
channel with variance $\sigma$, as long as we are safely in the Rayleigh-Jeans limit. In this
simple example with only one foreground to be projected out, we saw in equation 16 that
$\mathrm{FDF} = [1 - (\hat F^0 \cdot \vec F^1)^2 / \vec F^1 \cdot \vec F^1]^{-1/2}$. The dot products and hence the FDF depend on the
shape we assume for synchrotron emission (here I will assume $p_{\rm sync} = -2.9$) and also on the
placement of the frequency channels.

What is the optimal placement of frequency channels? And how many are needed? Let 
us first consider two frequency channels. For simplicity I will assume that they are centered 
about $\nu = 40$ GHz. Figure 2 shows the FDF as a function of the difference $\nu_{\rm high} - \nu_{\rm low}$. For
example $\nu_{\rm high} - \nu_{\rm low} = 10$ GHz indicates two channels placed at 35 and 45 GHz. The FDF in
that case is a little over three: the signal to noise is degraded by this factor. For very small
frequency differences, it is difficult to disentangle the CMB component from synchrotron;
hence the FDF is high. If the frequencies can be spread far apart, separating CMB
from synchrotron becomes easier and the FDF decreases accordingly. In the limit of very
large frequency difference, the FDF asymptotes to: 



$$\lim_{\nu_{\rm high} - \nu_{\rm low} \to \infty} \mathrm{FDF} = \sqrt{\frac{N_{\rm ch}}{N_{\rm ch} - N_{\rm fg}}} \qquad (21)$$

in this case $\sqrt{2}$. To see why, note that in the absence of foregrounds, the additional information
from all the channels beats down the noise by a factor of $1/\sqrt{N_{\rm ch}}$; this is the factor
explicitly present in equation 20. When a foreground component is present, at least one of
the channels must be used to determine the foreground amplitude. Thus even in the ideal
case, when the foreground component can be easily distinguished from the CMB, there is
still one fewer channel with which to measure the CMB. Hence, the true noise is now down
by a factor of $1/\sqrt{N_{\rm ch} - 1}$. And on it goes as more foreground amplitudes must be separated.
Note that these arguments are only valid in the Rayleigh-Jeans limit; for higher frequencies,
the limits in equation 20 and equation 21 no longer apply.
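
The two-channel behaviour described above can be reproduced with a few lines of Python. This is a sketch, not the code behind Figure 2: it assumes equal, uncorrelated noise, a pure power law for synchrotron, and a hypothetical helper `fdf_sync` with a switch for the Rayleigh-Jeans limit.

```python
import numpy as np

H_OVER_K = 0.0479924  # h / k_B in kelvin per GHz
T_CMB = 2.726         # kelvin, as quoted in the text

def fdf_sync(nu_ghz, p=-2.9, rayleigh_jeans=False):
    """FDF of eq. 16 for one power-law foreground and equal, uncorrelated noise."""
    nu = np.asarray(nu_ghz, dtype=float)
    if rayleigh_jeans:
        cmb = np.ones_like(nu)                          # CMB shape -> 1
    else:
        x = H_OVER_K * nu / T_CMB
        cmb = x**2 * np.exp(x) / np.expm1(x)**2         # eq. 5
    fg = (nu / nu.min())**p                             # eq. 19
    cos_sq = (cmb @ fg)**2 / ((cmb @ cmb) * (fg @ fg))
    return 1.0 / np.sqrt(1.0 - cos_sq)

# Two channels centered on 40 GHz with growing separation.
for spread in (2.0, 10.0, 30.0, 70.0):
    channels = [40.0 - spread / 2, 40.0 + spread / 2]
    print(spread, fdf_sync(channels), fdf_sync(channels, rayleigh_jeans=True))
```

In the Rayleigh-Jeans branch the FDF should fall toward $\sqrt{2}$ as the separation grows, in line with equation 21 for $N_{\rm ch} = 2$ and $N_{\rm fg} = 1$. With the full blackbody shape the large-separation values drift away from that limit, as the text cautions.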

Now consider three frequency channels. It is clear that it is best to get as large a spread in
the frequencies as possible. But where best to place the middle frequency channel? Figure 3
shows the FDF as a function of the frequency of the middle channel when $\nu_{\rm low} = 30$
GHz and $\nu_{\rm high} = 50$ GHz. Figure 3 shows that the optimal place for the middle frequency
channel is at $\nu = 50$ GHz! At first, this is surprising, but it makes sense upon further
reflection: the lowest channel is used to separate out the synchrotron component. The other
channels are best placed where they are least contaminated by synchrotron; thus
all other channels should go as high in frequency as possible. Of course, this example is
somewhat artificial: when more than one foreground component is projected out, it becomes
important to space out the channels more evenly. Nonetheless, I hope this simple example
alerts experimenters to the possibility that the best signal to noise may be achieved with an
unorthodox positioning of the frequency channels. In this simple example, the FDF varies
from 2.3 to 1.7, i.e. by roughly 30%, as one varies the placement of the middle channel. So
clever positioning of the intermediate channels could be an easy way to increase the final
signal to noise.

Figure 2: FDF as a function of the difference between the highest and lowest frequencies
in an experiment when synchrotron with assumed index -2.9 is fitted for. The extreme
channels are centered around 40 GHz (so $(\nu_{\rm low} + \nu_{\rm high})/2 = 40$ GHz). The curves with more
than two channels have their frequencies equally spaced between the two extremes. FDF is
lowered - hence the experiment has the best discrimination against foregrounds - when the
frequency spread is as large as possible.

Figure 3: FDF as a function of the placement of the central frequency in an experiment
with three channels, the other two at 30 and 50 GHz. Again the foreground component is
synchrotron with assumed index -2.9. The optimal place to locate the third channel is at
50 GHz, where FDF is minimized.
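
The three-channel placement question can be explored with a short scan over the middle frequency. The sketch below assumes equal, uncorrelated noise and a pure power-law synchrotron spectrum; the grid of trial frequencies and the helper name `fdf_sync` are choices made for the example.

```python
import numpy as np

H_OVER_K = 0.0479924  # h / k_B in kelvin per GHz
T_CMB = 2.726         # kelvin, as quoted in the text

def fdf_sync(nu_ghz, p=-2.9):
    """FDF of eq. 16 for one power-law foreground and equal, uncorrelated noise."""
    nu = np.asarray(nu_ghz, dtype=float)
    x = H_OVER_K * nu / T_CMB
    cmb = x**2 * np.exp(x) / np.expm1(x)**2             # eq. 5
    fg = (nu / nu.min())**p                             # eq. 19
    cos_sq = (cmb @ fg)**2 / ((cmb @ cmb) * (fg @ fg))
    return 1.0 / np.sqrt(1.0 - cos_sq)

# Outer channels fixed at 30 and 50 GHz; slide the middle channel between them.
middle = np.linspace(30.0, 50.0, 21)
fdf = np.array([fdf_sync([30.0, m, 50.0]) for m in middle])
best = middle[np.argmin(fdf)]
print(best, fdf[0], fdf[-1])
```

Under these assumptions the scan should pick the upper edge of the range as the best location, with end-point values close to the 2.3 and 1.7 quoted above.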

Figure 2 shows the FDF for this case of three frequency channels as a function of the
difference between the highest and lowest channel. In this graph, the middle channel is not
placed in the optimal position (usually at the highest frequency possible), but all the channels
are evenly spaced. (Thus the point corresponding to $\nu_{\rm high} - \nu_{\rm low} = 10$ GHz and $N_{\rm ch} = 3$
has channels at $\nu = 35, 40, 45$ GHz.) It is interesting that, except for the largest frequency
spreads, adding extra channels does not really help in disentangling the foregrounds. (This
point was also made by Brandt et al. (1994).) Certainly going beyond $N_{\rm ch} = 3$ provides
very little gain in this simple case.

3.2 Uncertain foreground shape



We can generalize the discussion of the previous subsection by accounting for the fact that
the shape of the synchrotron spectrum is not perfectly determined. If we allow for this
uncertainty, there arises a new term in the variance of the estimator. Following equation 13
we see that

$$\sigma_{\rm shape}^2 = \left\langle \left( \vec T^{\rm syn} \cdot \left[ K^{-1}_{00} \hat F^{\rm cmb} + K^{-1}_{01} \vec F^{\rm syn} \right] \right)^2 \right\rangle \qquad (22)$$

where again I emphasize that $\vec T^{\rm syn}$ is the true synchrotron temperature, with a spectral shape
that differs from the assumed one. We will suppose that the true shape of the synchrotron
spectrum is still given by equation 19 but with spectral index $p \neq -2.9$, the assumed index.
Figure 4 shows the error induced by assuming the wrong spectral index. For the kind of
uncertainty typically measured for synchrotron, $\Delta p \simeq 0.2$, the error induced is less than a
few percent of the synchrotron amplitude. Thus, if the synchrotron amplitude is 40 $\mu$K, the
uncertainty in the spectral index contributes less than 1 $\mu$K to the total error. This error
is very small and for reasonable noise values will be much smaller than the (FDF) $\sigma/\sqrt{N_{\rm ch}}$
factor discussed above.

Figure 4: $\sigma_{\rm shape}$: the variance in the determination of the CMB temperature due to uncertainty
in the shape of the foreground spectrum being fitted for. The assumed spectral index
is -2.9; if the true index is equal to this, then $\sigma_{\rm shape}$ vanishes.

It is instructive to understand why the uncertainty in the spectral index leads to very
small errors in the CMB temperature determination. By projecting out the $p = -2.9$
component, we are looking in the space perpendicular to the shape vector defined by $p =
-2.9$. But, the vector defined by $p = -2.8$ is almost perfectly parallel to the $p = -2.9$
shape vector. Thus, it has a very small component in the perpendicular space. Unless the
amplitude is extremely large, the perpendicular component is negligible.

Figure 4 shows that, even when we project out only synchrotron emission, we also succeed
in eliminating a large fraction of the bremsstrahlung (with $p = -2.1$) as well. In this
example, only 15% of the bremsstrahlung amplitude remains after projecting out a $p = -2.9$
component. So the simple projection of $p = -2.9$ is sufficient for all but the most sensitive
experiments.
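
The size of the residual left by a mis-specified spectral index can be estimated directly from the estimator weights. The sketch below is illustrative only: it assumes three channels at 30, 40 and 50 GHz, equal and uncorrelated noise, and a true foreground that is a pure power law normalized at the lowest channel. The function name `leakage` is hypothetical.

```python
import numpy as np

H_OVER_K = 0.0479924  # h / k_B in kelvin per GHz
T_CMB = 2.726         # kelvin, as quoted in the text

def leakage(nu_ghz, p_true, p_assumed=-2.9):
    """Residual in theta0 per unit foreground amplitude at the lowest channel.

    This is the bracket of eq. 22 for one fitted power law and equal,
    uncorrelated noise, applied to a true power law of index p_true.
    """
    nu = np.asarray(nu_ghz, dtype=float)
    x = H_OVER_K * nu / T_CMB
    cmb = x**2 * np.exp(x) / np.expm1(x)**2          # eq. 5
    F = np.array([cmb, (nu / nu.min())**p_assumed])  # assumed shapes
    M = F @ F.T                                      # C^-1 proportional to identity
    weights = np.linalg.solve(M, F)[0]               # theta0 = weights @ T
    return weights @ (nu / nu.min())**p_true

nu = [30.0, 40.0, 50.0]
for p in (-2.9, -2.8, -2.7, -2.1):
    print(p, leakage(nu, p))
```

With these assumptions a shift of 0.1 in the index should leak only a few percent of the foreground amplitude, while a bremsstrahlung-like index of -2.1 leaks a noticeably larger but still modest fraction.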




Figure 5: $\sigma_{\rm shape}$ for different true foreground shapes as a function of the frequency range of
an experiment. Solid lines are for a 4 channel experiment (with equally spaced frequencies);
dashed lines for a 2 channel experiment. In all cases, the range is centered around 40 GHz.

Figure 5 shows how $\sigma_{\rm shape}$ varies as the frequency coverage changes. In this example,
increasing the frequency range always leads to an increase in $\sigma_{\rm shape}$. For some foregrounds,
increasing the number of channels also leads to an increase in $\sigma_{\rm shape}$, although this is not
true for dust here. I have not been able to figure out any general principles for minimizing
$\sigma_{\rm shape}$; fortunately, it is simple enough to deal with each case individually.

3.3 Projecting out two components

This subsection deals with an analysis question. How best to analyze the data? In particular, 
should one attempt to fit for several components or is it best to fit for fewer components? 
I will argue that projecting out two components often will lead to a larger variance than if 
one simply projected out one component as in §3.1. 




Figure 6: FDF as a function of frequency range in a three channel experiment. Projecting 
out two components leads to a much larger FDF. 

Figure 6 shows the FDF for an experiment with three frequency channels when both
synchrotron [with index -2.9] and bremsstrahlung are fitted for. For comparison, also
shown is the FDF if only synchrotron is fitted for. Again the central channel is at $\nu = 40$
GHz. In all cases, the FDF is much higher if both components are fitted for. For example,
with channels at $\nu = 30, 40, 50$ GHz, figure 6 shows that the "one-component" FDF = 2 while the
"two-component" FDF = 14.

Let me pursue this example further. When is it advantageous to project out two components?
The total variance in the two-component analysis is

$$\sigma_\theta^2 \big|_{\rm two-component} = (14)^2 \left(\sigma_\theta^{(0)}\right)^2. \qquad (23)$$

There is also a small uncertainty due to the unknown shape but this is very small so I
neglect it. In the one-component analysis, we must include the shape uncertainty since the
bremsstrahlung amplitude is not projected out. Thus

$$\sigma_\theta^2 \big|_{\rm one-component} = (2)^2 \left(\sigma_\theta^{(0)}\right)^2 + (0.18)^2 \left\langle T_{\rm brem}^2(\nu = 30\,{\rm GHz}) \right\rangle. \qquad (24)$$

The coefficient 0.18 in equation 24 can be simply read off of figure 4. It becomes useful to
analyze the data by fitting for two components only when
$\sigma_\theta^2|_{\rm two-component} < \sigma_\theta^2|_{\rm one-component}$.
Using equations 23 and 24 we find that this occurs when

$$\frac{\left\langle T_{\rm brem}^2(\nu = 30\,{\rm GHz}) \right\rangle^{1/2}}{\sigma_\theta^{(0)}} > 77. \qquad (25)$$



For even the most sensitive experiments, we do not expect foreground amplitudes of this 
magnitude. So in this example, it would be best to analyze the experiment by fitting for 
only the synchrotron component. 
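
The inputs to this comparison (the two FDFs and the bremsstrahlung coefficient) can be recomputed from the formalism of §2. The following sketch assumes channels at 30, 40 and 50 GHz, equal and uncorrelated noise, and pure power-law foregrounds with indices -2.9 and -2.1; the helper names are hypothetical.

```python
import numpy as np

H_OVER_K, T_CMB = 0.0479924, 2.726   # h / k_B in K per GHz; CMB temperature in K

def shapes(nu, indices):
    """Rows: CMB shape (eq. 5) followed by one power law (eq. 19) per index."""
    x = H_OVER_K * nu / T_CMB
    cmb = x**2 * np.exp(x) / np.expm1(x)**2
    return np.array([cmb] + [(nu / nu.min())**p for p in indices])

def fdf_and_weights(F):
    """FDF (eq. 12) and estimator weights for equal, uncorrelated noise."""
    M = F @ F.T
    M_inv = np.linalg.inv(M)
    fdf = np.sqrt(M_inv[0, 0] * M[0, 0])   # K^-1_00 with K = M / M_00
    return fdf, M_inv[0] @ F               # theta0 = weights @ T

nu = np.array([30.0, 40.0, 50.0])
fdf_one, w_one = fdf_and_weights(shapes(nu, [-2.9]))
fdf_two, _ = fdf_and_weights(shapes(nu, [-2.9, -2.1]))
brem_leak = w_one @ (nu / 30.0)**-2.1      # coefficient of T_brem in eq. 24
threshold = np.sqrt(fdf_two**2 - fdf_one**2) / abs(brem_leak)   # eq. 25
print(fdf_one, fdf_two, brem_leak, threshold)
```

The values should land near the rounded numbers used in equations 23 to 25. Small differences are expected because the text works with FDF = 2, FDF = 14 and a coefficient read off a figure.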

One must pursue each example on a case by case basis. This simple analytic technique 
should prove useful in deciding how best to analyze the data. This simple example suggests 
that fitting for fewer components leads to a smaller variance in the CMB temperature; this 
agrees with the general point made by Brandt et al. (1994). Now let us turn to a more 
quantitative comparison with that work. 



3.4 Comparison with Brandt et al. 

The analytic techniques presented here can be compared with the Monte Carlos performed by
Brandt et al. (1994). Here I focus on one example of theirs, a seven channel experiment with
equally spaced frequencies between 25 and 38 GHz. I will not describe their methodology in
detail [please see their paper for a lucid description of what they did]. For the purposes of
comparison, note that they were interested in the same quantity I have been focusing on: the
total variance in the determination of the CMB temperature. I have called it $\sigma_\theta$; they called
it $E_{\rm rms}$. Given the experimental configuration and the average foreground levels, we can
plot this variance as a function of instrumental and atmospheric noise per channel, $\sigma$ [which
has a rough equivalent in their notation]. Figure 7 shows a comparison of the analytic technique and
the Monte Carlos. The points are two different techniques that they used to extract the CMB
temperature. They allow for free synchrotron amplitude [model Q2] and free synchrotron
and bremsstrahlung amplitudes [model P3]. This corresponds to projecting out $N_{\rm fg} = 1, 2$
foregrounds respectively. The curves show $\sigma_\theta$ with these two projections. To get these curves
I needed $\langle (T^{\rm syn})^2 \rangle$ and $\langle (T^{\rm brem})^2 \rangle$; I took the same values they used in their Monte Carlos.

The agreement is excellent and shows clearly that the simple analytic technique adequately
describes the situation. The shapes of their curves now become obvious: at low
noise levels [small $\sigma$], $\sigma_{\rm shape}$ dominates over the FDF-enhanced noise. Since $\sigma_{\rm shape}$ is independent
of noise, the total variance is also independent of noise in this regime. That is, at
low $\sigma$, $\sigma_\theta$ is constant. In the opposite limit, FDF-enhanced noise dominates over $\sigma_{\rm shape}$, so
the total variance increases linearly with $\sigma$.

Figure 7: The variance in the determination of the CMB temperature as a function of the
noise per channel. The points denote two different methods used by Brandt et al (1994)
to extract the CMB temperature. The lines are the variances one gets with the analytic
formula of Eq. 1 fitting for one and two foreground components.

One final point about our approaches. They presented many other "models" for extracting
the CMB temperature. For example, one of their models allowed the synchrotron
index to be a free parameter. Within the analytic framework presented here, I cannot allow
the shapes to be free parameters. However, the variances using such models are much higher
than the variances in the models where only the amplitudes are allowed to vary. So I would
argue that the analytic technique cannot do everything but it can do the things that are
worth doing.

3.5 Breaking up bands and noise correlations 

We have seen in the previous sections that adding more frequency channels to an experiment 
is not necessarily a good thing. Intermediate channels are not as effective in separating
out different spectra; a longer lever arm with good sensitivity at both ends is often preferable.
In this section I focus on another possible danger of splitting up bands. Often when a given
frequency band is split up, the noise in the new channels is correlated. How does correlated 
noise impact on the decision to split up bands? Here, I address this question in the context 
of a simple example. 

Consider an experiment with two channels in the Rayleigh-Jeans regime, say $\nu = 15, 45$
GHz. The noise in each channel is assumed to have variance $\sigma$, so in the absence of foregrounds,
the variance in the determination of the CMB temperature would be $\sigma/\sqrt{2}$. If
we wish to project out synchrotron emission (with assumed spectral index -3), then this
variance is increased by the FDF. In this case a simple computation yields FDF = 1.51, so
the total variance in the experiment is

$$\sigma_{\theta,2} = 1.07\sigma \qquad (26)$$

where the subscript 2 denotes the number of channels. Is it worthwhile to add two new
frequency channels at $\nu = 25, 35$ GHz? I will assume that in so doing, the noise in each
channel increases by $\sqrt{2}$, so that in the absence of foregrounds, the variance would still be
$(\sqrt{2}\sigma)/\sqrt{4} = \sigma/\sqrt{2}$. If there were no correlations introduced between the different channels,
then we could do a simple calculation and find that FDF = 1.33. Thus the total variance in
this 4-channel case is only $0.94\sigma$, smaller than in the two channel case and perhaps worth
the effort.

However, if correlations amongst the different channels are introduced, then the calculation
becomes slightly less trivial. Here I carry out the calculation in this correlated case
for several reasons. First, this will give us a sense of whether or not it is important to avoid 
correlations. But more importantly, I hope that this provides yet another example of how 
useful the formalism of §2 can be when it comes to analyzing specific problems. 

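For simplicity I will assume that the two lowest channels are correlated as are the two
highest channels, so the new noise correlation matrix is

$$C = 2\sigma^2 \begin{pmatrix} 1 & \epsilon & 0 & 0 \\ \epsilon & 1 & 0 & 0 \\ 0 & 0 & 1 & \epsilon \\ 0 & 0 & \epsilon & 1 \end{pmatrix}. \qquad (27)$$

For the calculation we will need the inverse of C. A short calculation shows that

$$C^{-1} = \frac{1}{2\sigma^2 (1 - \epsilon^2)} \begin{pmatrix} 1 & -\epsilon & 0 & 0 \\ -\epsilon & 1 & 0 & 0 \\ 0 & 0 & 1 & -\epsilon \\ 0 & 0 & -\epsilon & 1 \end{pmatrix}. \qquad (28)$$

We can now immediately calculate the variance in the absence of foregrounds:

$$\begin{aligned} \left(\sigma_\theta^{(0)}\right)^{-2} &= \sum_{a,b} C^{-1}_{ab} \\ &= \frac{2}{\sigma^2 (1 + \epsilon)} \end{aligned} \qquad (29)$$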



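where in the first line I have assumed that we are deep enough into the Rayleigh-Jeans limit to
set $\hat F^0_a = 1$. Thus, the variance in such an experiment - in the absence of foregrounds -
increases due to correlations by a factor of $\sqrt{1 + \epsilon}$. This is a very simple way of saying what
I and my collaborators tried to illustrate in Dodelson, Kosowsky, and Myers (1995). Now
let us include the effects of foregrounds. Our standard formula gives

$$\mathrm{FDF}^2 = \frac{1}{1 - (\vec F^1 \cdot \hat F^0)^2 / (\vec F^1 \cdot \vec F^1)} \qquad (30)$$

so we need to calculate the two dot products. The only complication here is that we need
to account for the non-diagonal structure of C. Thus,

$$\hat F^0 \cdot \vec F^1 = \left(\sigma_\theta^{(0)}\right)^2 \sum_{a,b} C^{-1}_{ab} F^1_b = \frac{(\sigma_\theta^{(0)})^2}{2\sigma^2 (1 + \epsilon)} \sum_a F^1_a \qquad (31)$$

and

$$\vec F^1 \cdot \vec F^1 = \frac{(\sigma_\theta^{(0)})^2}{2\sigma^2 (1 - \epsilon^2)} \left[ \sum_a (F^1_a)^2 - 2\epsilon \left( F^1_1 F^1_2 + F^1_3 F^1_4 \right) \right]. \qquad (32)$$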


With these expressions for the dot products we can now evaluate the FDF with equation (30).
Figure 8 shows the variance in the determination of the CMB temperature as a function
of the correlation between the channels when the synchrotron is fitted for. Apparently,
increased correlation does not significantly increase the variance. So, at least in this example,
noise correlation should not deter experimenters from adding new channels.

Figure 8: The variance in the determination of the CMB temperature as a function of
the correlation amongst the different frequency channels. This is to be compared with the
horizontal line, the variance in the two channel case when there is no correlation. Since the
four channel curve is lower than the horizontal line, it would always be advantageous to add
the extra channels in this case even if correlations were introduced.
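The calculation of this section can be sketched numerically. The following is a minimal illustration, not the code used to produce Figure 8: the channel frequencies, the power-law form of the synchrotron shape, and the function names are all assumptions made for the example, and the block structure of the noise matrix follows the description above (two correlated pairs, each channel with noise variance $2\sigma^2$).

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dot(u, v, c_inv):
    """Noise-weighted dot product u . v = sum_ab u_a (C^-1)_ab v_b."""
    n = len(u)
    return sum(u[a] * c_inv[a][b] * v[b] for a in range(n) for b in range(n))

def correlated_case(eps, sigma=1.0, freqs=(30.0, 35.0, 40.0, 45.0), index=-2.9):
    """Return (sigma0, FDF) for four channels whose noise is correlated in pairs.

    freqs (GHz) and the power-law index are hypothetical inputs for illustration.
    """
    pattern = ([1, eps, 0, 0], [eps, 1, 0, 0], [0, 0, 1, eps], [0, 0, eps, 1])
    c = [[2 * sigma ** 2 * x for x in row] for row in pattern]
    c_inv = invert(c)
    f0 = [1.0] * 4                                   # CMB shape in the Rayleigh-Jeans limit
    f1 = [(nu / freqs[0]) ** index for nu in freqs]  # assumed power-law foreground shape
    sigma0_sq = 1.0 / dot(f0, f0, c_inv)
    fdf_sq = 1.0 / (1.0 - sigma0_sq * dot(f1, f0, c_inv) ** 2 / dot(f1, f1, c_inv))
    return math.sqrt(sigma0_sq), math.sqrt(fdf_sq)
```

Calling `correlated_case(eps)` for a range of `eps` between 0 and 1 (exclusive) gives the foreground-free variance, which should follow $\sigma\sqrt{(1+\epsilon)/2}$, together with the FDF for the assumed channel set.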



3.6 Current Experiments 

To get a feel for how well current experiments are doing at separating out foregrounds, I
compiled Table 1. For each experiment, the FDF is computed for a given spectral index.
For example the FDF for COBE fitting for bremsstrahlung is 1.75. Also shown is the
uncertainty due to the shape. Again for COBE, if bremsstrahlung is fit for, then a dust
component [with index 1.5] contributes an uncertainty $\sigma_{\rm shape} = 5.32\,\langle (T_{\rm lowest})^2 \rangle^{1/2}$. For COBE,
the lowest frequency channel is at 31 GHz. At this frequency, one expects an rms dust
antenna temperature of order a few $\mu$K, so - in the absence of any other maps - the uncertainty
due to dust would be of order 10 $\mu$K. [This is not intended to be a rigorous estimate of the
uncertainty due to dust, just a guide to reading the table. COBE has access to - and used -
much other information to get a handle on dust. See for example Bennett et al. (1994).]
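As a guide to combining the two columns of Table 1, the sketch below adds the noise and shape contributions in quadrature, following the decomposition of the variance into an FDF-amplified noise term and a shape term. The function name and the numerical inputs for the noise and the dust temperature are illustrative placeholders, not values taken from any experiment.

```python
import math

def total_sigma(fdf, sigma0, shape_coeff, t_lowest):
    """Combine the two contributions to the uncertainty in quadrature.

    fdf         : foreground degradation factor
    sigma0      : uncertainty in the absence of foregrounds
    shape_coeff : tabulated ratio sigma_shape / T_lowest
    t_lowest    : rms foreground temperature in the lowest channel
    """
    noise_part = fdf * sigma0
    shape_part = shape_coeff * t_lowest
    return math.sqrt(noise_part ** 2 + shape_part ** 2)

# COBE row of Table 1: FDF = 1.75 when fitting for bremsstrahlung, and a dust
# shape coefficient of 5.32. The values sigma0 = 10 and t_lowest = 2 (both in
# micro-Kelvin) are hypothetical and only show how the pieces combine.
cobe_example = total_sigma(fdf=1.75, sigma0=10.0, shape_coeff=5.32, t_lowest=2.0)
```

With these placeholder inputs the shape term alone is 5.32 x 2, about 10 micro-Kelvin, which is the "of order 10 micro-Kelvin" estimate quoted in the text.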

A cursory look at some of the other experiments in Table 1 shows that typical FDF's
are of order 1 to 4. Bolometer experiments like MAX and MSAM do very well at projecting
out dust. [Note though that MAX3 in particular could not distinguish well between CMB
and bremsstrahlung.] The HEMT experiments do not have large frequency coverage, so
they discriminate less well than the high frequency experiments. However, the recent
modifications to the South Pole and the Saskatoon experiments [additions of higher frequency
channels] have significantly reduced their FDF's.

The last line of Table 1 presents the FDF for a hypothetical experiment with equally
spaced frequencies between 30 and 120 GHz (this case was also analyzed by Brandt et al.
(1994)). Analyzing by projecting out two components leads to a variance squared equal to
$(2.97 \times \sigma/\sqrt{N_{\rm ch}})^2 + (0.169\, T_{\rm brem}(30\,{\rm GHz}))^2$, where $\sigma$ is the noise in each channel.

Table 1: FDF's for selected experiments.

Experiment    Assumed Index    FDF      Foreground Index    $\sigma_{\rm shape}/\langle T_{\rm lowest}^2\rangle^{1/2}$
COBE^a        -2.1              1.75     1.5                 5.32
FIRS^b         1.5              1.02     2                   2.45
MAX3^c        -2.1             11.5      1.5                73.7
MAX4^d        -2.1              4.09     1.5                12.4
MAX3           1.5              1.12    -2.1                 2.09
MAX4           1.5              1.06    -2.1                 1.16
MAX3           1.5              1.12     2                   1.34
MAX4           1.5              1.06     2                   1.95
MSAM^e         1.5              1.02     2                   2.48
SK93^f        -2.9              4.48    -2.1                 0.234
SK94^g        -2.9              2.35    -2.1                 0.180
SP91^h        -2.9              4.00    -2.1                 0.225
SP94^i        -2.9              2.40    -2.1                 0.179
Tenerife^j    -2.9              1.54    -2.1                 0.094
Satellite     -2.9, 1.5         2.97    -2.1                 0.169

a Bennett et al. (1994)
b Ganga et al. (1994)
c Meinhold et al. (1993)
d Clapp et al. (1995)
e Cheng et al. (1994)
f Wollack et al. (1993)
g Netterfield et al. (1994)
h Gaier et al. (1992)
i Gundersen et al. (1994)
j Hancock et al. (1994)






4 Conclusions 



This paper has introduced an analytic technique that can be used to help design an
experiment and to analyze data. The main result coming out of this analytic treatment is that the
variance in the determination of the CMB temperature has two components. First, there
is a component proportional to noise; due to fitting for foregrounds, noise is amplified by
the FDF. Second, there is a component $\sigma_{\rm shape}$ proportional to the foreground amplitudes.
This component vanishes if the foreground spectra are known, but is non-zero if there is
some uncertainty in the shapes. This simple model of CMB extraction was shown in §3.4 to
reproduce the Monte Carlo results of Brandt et al. (1994) very accurately.

I think the most useful thing to emerge is the technique itself, which is easy to understand 
and implement. Any given experiment will have its own set of complications, so it is dan- 
gerous to make general conclusions about the "best" set of frequency channels. Nonetheless, 
on the basis of the work in §3, there are several general principles that should be considered 
in any experimental plan/analysis. 

• A wide range of frequencies does best at minimizing the FDF. In general this would lead
one to go with large frequency ranges. Indeed, I would argue that experiments with
bolometers have been more successful to date at extracting the CMB since they allow
a larger frequency range. However, one can envision circumstances where increasing
the range is not beneficial. In particular, as shown in §3.2, increasing the frequency
range often leads to a larger $\sigma_{\rm shape}$. This effect can be even more dramatic if the new
frequencies are more sensitive to a different foreground component [e.g. a 120 GHz
channel added to a low frequency experiment would be more sensitive to dust].

• Equal spacing of the intermediate channels is not always the optimal way to go. Fur- 
thermore, adding more intermediate channels is also not necessarily beneficial. How- 
ever, it does not appear - at least from the example studied in §3.5 - that noise 
correlations amongst different channels should be a deterrent in this regard. 

• In terms of analysis, in agreement with the results of Brandt et al. (1994), I found
in §3.3 that it is best to fit for as few components as possible. A cursory glimpse at
present experiments suggests that their signal to noise is degraded due to foregrounds
by a factor ranging from one to four, with bolometers at the low end and HEMTs
at the high end. A satellite experiment with frequencies ranging from 30 to 130 GHz
would see its signal to noise degraded by a factor of about three.
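The first principle in the list above can be illustrated with a small numerical sketch. It assumes equal, uncorrelated noise in every channel (so the noise level drops out of the FDF), a CMB shape equal to one in every channel (the Rayleigh-Jeans limit), and a single power-law foreground. The frequency sets and the function name are hypothetical choices made for the example.

```python
import math

def fdf_single_foreground(freqs, index):
    """FDF when fitting for the CMB plus one power-law foreground.

    Assumes equal uncorrelated noise and F^0_a = 1 in every channel, so that
    FDF^2 = 1 / (1 - (sum_a F^1_a)^2 / (N_ch * sum_a (F^1_a)^2)).
    """
    n_ch = len(freqs)
    f1 = [(nu / freqs[0]) ** index for nu in freqs]  # assumed power-law shape
    s1 = sum(f1)
    s2 = sum(x * x for x in f1)
    return 1.0 / math.sqrt(1.0 - s1 * s1 / (n_ch * s2))

# Two hypothetical two-channel experiments fitting for synchrotron (index -2.9).
narrow = fdf_single_foreground([30.0, 40.0], -2.9)
wide = fdf_single_foreground([30.0, 90.0], -2.9)
```

In this toy comparison the wider lever arm gives the smaller FDF, because the foreground shape then differs more from the flat CMB shape across the channels. It says nothing about $\sigma_{\rm shape}$, which can grow as the range increases.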



A Derivation of Variances



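We want to carry out the minimization of §2 over the free parameters, the amplitudes $\theta^i$. This minimization
requirement is satisfied when

$$
F^j \cdot \left( T - \sum_{i=0}^{N_{\rm fg}} F^i \theta^i \right) = 0. \tag{33}
$$

Using the definition $K_{ji} \equiv F^j \cdot F^i$ leads to

$$
\sum_{i=0}^{N_{\rm fg}} K_{ji} \theta^i = F^j \cdot T. \tag{34}
$$

Multiplying by $K^{-1}$ and summing over $j$ leads to

$$
\theta^i = \sum_{j=0}^{N_{\rm fg}} K^{-1}_{ij} F^j \cdot T. \tag{35}
$$

The $i = 0$ component of this equation is the estimator for the CMB temperature presented
in equation (10).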
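Equations (33)-(35) amount to solving a small linear system. A minimal sketch follows; the function names, the example shape vectors, and the choice of an identity inverse noise matrix are assumptions made for illustration only.

```python
def dot(u, v, c_inv):
    """Noise-weighted dot product u . v = sum_ab u_a (C^-1)_ab v_b."""
    n = len(u)
    return sum(u[a] * c_inv[a][b] * v[b] for a in range(n) for b in range(n))

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (aug[i][n] - tail) / aug[i][i]
    return sol

def estimate_amplitudes(shapes, temps, c_inv):
    """Return the best-fit amplitudes theta^i; shapes[0] is the CMB shape F^0."""
    k = [[dot(fj, fi, c_inv) for fi in shapes] for fj in shapes]   # K_ji = F^j . F^i
    rhs = [dot(fj, temps, c_inv) for fj in shapes]                 # F^j . T
    return solve(k, rhs)                                           # theta = K^-1 (F . T)

# Hypothetical three-channel example with identity inverse noise matrix.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shapes = [[1.0, 1.0, 1.0], [1.0, 0.5, 0.25]]
temps = [30.0 + 5.0 * f for f in shapes[1]]     # noise-free data: CMB = 30, foreground = 5
theta = estimate_amplitudes(shapes, temps, identity)
```

For noise-free data built from the assumed shapes, the first element of `theta` recovers the input CMB amplitude, as expected of an unbiased estimator.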

Now we want to calculate the variance of the estimator for the CMB temperature. Start
from the definition of the variance, and use equation (10) for the estimator and the expansion of $T$ from §2. Then,



/ ^fg r ^fg 

\j=0 



■ i=0 



(36) 



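Consider the $i = 0$ term here. This is

$$
\sum_{j=0}^{N_{\rm fg}} K^{-1}_{0j} F^j \cdot F^0 \theta^0 = \sum_{j=0}^{N_{\rm fg}} K^{-1}_{0j} K_{j0} \theta^0 = \theta^0, \tag{37}
$$

so this part of the sum exactly cancels the $\theta^0$ term in equation (36). Thus we are left with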

2 



/ ^fg r ^fg 

\j=0 '-1=1 



(38) 
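
The first term in square brackets is exactly $\sigma_{\rm shape}$ in equation (13). The noise term is slightly
more complicated. It is

$$
\begin{aligned}
\left\langle \left( \sum_{j=0}^{N_{\rm fg}} K^{-1}_{0j} F^j \cdot N \right)^2 \right\rangle
&= \sum_{j=0}^{N_{\rm fg}} \sum_{ab} \sum_{j'=0}^{N_{\rm fg}} \sum_{a'b'} K^{-1}_{0j} F^j_a C^{-1}_{ab} \langle N_b N_{b'} \rangle C^{-1}_{b'a'} F^{j'}_{a'} K^{-1}_{0j'} \\
&= \sum_{jj'} K^{-1}_{0j} K^{-1}_{0j'} \sum_{aa'bb'} F^j_a C^{-1}_{ab} C_{bb'} C^{-1}_{b'a'} F^{j'}_{a'} \\
&= \sum_{jj'} K^{-1}_{0j} K^{-1}_{0j'} \left( \sum_{aa'} F^j_a C^{-1}_{aa'} F^{j'}_{a'} \right).
\end{aligned} \tag{39}
$$

In going from the first to the second line here I have used $\langle N_b N_{b'} \rangle = C_{bb'}$. The term in parentheses
on the last line in equation (39) is by definition equal to $F^j \cdot F^{j'} = K_{jj'}$. This contracts with
one of the $K^{-1}$'s to give $\delta_{0j}$. Thus all that is left of the sum is $K^{-1}_{00}$. This corresponds to
the FDF in equation (12).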
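The result that the noise term reduces to $K^{-1}_{00}$ can be checked numerically by propagating the noise through the estimator weights directly. The sketch below does this for the CMB plus one foreground with diagonal (uncorrelated) noise. The function name and the input numbers are hypothetical, and the diagonal noise matrix is a simplifying assumption, not a restriction of the derivation.

```python
def noise_variance_check(f0, f1, noise_var):
    """Compare direct noise propagation with (K^-1)_00 for one foreground.

    f0, f1    : shape vectors of the CMB and of the foreground
    noise_var : per-channel noise variances (diagonal C assumed)
    Returns (propagated_variance, kinv00, fdf_squared).
    """
    def dot(u, v):
        # Dot product weighted by the inverse of the diagonal noise matrix.
        return sum(a * b / s for a, b, s in zip(u, v, noise_var))

    k00, k01, k11 = dot(f0, f0), dot(f0, f1), dot(f1, f1)
    det = k00 * k11 - k01 ** 2
    kinv00, kinv01 = k11 / det, -k01 / det      # first row of the 2x2 inverse of K

    # Weights w_a such that the CMB estimator equals sum_a w_a T_a.
    weights = [(kinv00 * a + kinv01 * b) / s for a, b, s in zip(f0, f1, noise_var)]
    propagated = sum(w * w * s for w, s in zip(weights, noise_var))

    fdf_squared = kinv00 * k00                  # (K^-1)_00 divided by sigma0^2 = 1 / K_00
    return propagated, kinv00, fdf_squared

propagated, kinv00, fdf_sq = noise_variance_check(
    [1.0, 1.0, 1.0], [1.0, 0.5, 0.25], [1.0, 2.0, 0.5])
```

The first two returned numbers agree to rounding error, and the third is never smaller than one, which is the statement that fitting for a foreground can only amplify the noise.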



B Marginalization 

This appendix presents what appears to be another way to extract the CMB signal. For a 
long time I thought that this estimator was better than the one presented in the body of 
the paper. I even gave a talk or two explaining that this estimator was preferable to any 
other. This is not true. Both estimators are identical. In this appendix, I first present the 
other method and then prove that both estimators, although they look completely different, 
are in fact identical. Albert Stebbins and I in Dodelson & Stebbins (1994) wrote about 
marginalization and Dodelson & Kosowsky (1995) describes some more marginalization work. 
All of this is now shown to be equivalent to the best fit technique used for example by Cheng 
et al. (1994) to analyse the MSAM data. 

The idea is to project out all the foreground sources. The $N_{\rm fg}$ foregrounds span an
$N_{\rm fg}$-dimensional subspace of the full $N_{\rm ch}$-dimensional space; call this subspace $\mathcal{F}$. All foreground contributions to
the signal live in $\mathcal{F}$. The orthogonal complement of $\mathcal{F}$, $\mathcal{F}_\perp$, is the space which contains all
vectors orthogonal to the foregrounds. Thus any vector in $\mathcal{F}_\perp$ is completely independent of
foregrounds. What we need to do, therefore, is project the observed temperature vector on
to $\mathcal{F}_\perp$. This projected temperature will be independent of any foregrounds and hence will
provide an unbiased estimate of the CMB temperature. To project on to $\mathcal{F}_\perp$, we first need
a set of basis vectors in $\mathcal{F}_\perp$. Let us call these

$$
z^{(r)}, \qquad r = 1, \ldots, N_{\rm ch} - N_{\rm fg}. \tag{40}
$$

These basis vectors are chosen to be orthonormal, so they are perpendicular to each other
and they have unit norm:

$$
z^{(r)} \cdot z^{(s)} = \delta_{rs}. \tag{41}
$$

Of course there are $N_{\rm ch} - N_{\rm fg}$ of these vectors since they span the $(N_{\rm ch} - N_{\rm fg})$-dimensional
space $\mathcal{F}_\perp$. Finally, by the definition of $\mathcal{F}_\perp$, the vectors $z^{(r)}$ must satisfy

$$
z^{(r)} \cdot F^i = 0, \qquad r = 1, \ldots, N_{\rm ch} - N_{\rm fg}; \quad i = 1, \ldots, N_{\rm fg}. \tag{42}
$$

With the basis vectors $z^{(r)}$, we can form the projection operators which project any vector
in the full $N_{\rm ch}$-dimensional space on to $\mathcal{F}_\perp$, the space independent of foregrounds. For an
arbitrary vector $x$ in the full space,

$$
x_\perp = \sum_{r=1}^{N_{\rm ch} - N_{\rm fg}} z^{(r)} \left( z^{(r)} \cdot x \right) \tag{43}
$$

is the projection on to $\mathcal{F}_\perp$. Thus $x_\perp$ is independent of any foregrounds.

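We are now in a position to get the marginalization estimate for the CMB temperature.
First we project the observed temperature on to the space independent of foregrounds. Then
we find the CMB component of this projected temperature. The estimator is therefore

$$
\theta' = \frac{\sum_{r=1}^{N_{\rm ch} - N_{\rm fg}} \left( F^0 \cdot z^{(r)} \right) \left( z^{(r)} \cdot T \right)}{\sum_{r=1}^{N_{\rm ch} - N_{\rm fg}} \left( F^0 \cdot z^{(r)} \right)^2}. \tag{44}
$$

The denominator in equation (44) is simply to get the normalization right. [Thus when
$T = \theta^0 F^0$ the estimator $\theta'$ will give $\theta^0$.]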

This estimator looks [to me] completely different from the estimator in equation (10). I
now show that the two are equivalent. Let me write $\theta = a \cdot T$ and $\theta' = a' \cdot T$ so

$$
a = \sum_{j=0}^{N_{\rm fg}} K^{-1}_{0j} F^j; \qquad a' = \frac{\sum_r z^{(r)} \left( z^{(r)} \cdot F^0 \right)}{\sum_r \left( F^0 \cdot z^{(r)} \right)^2}. \tag{45}
$$

If I can show that these two vectors are equivalent, then I have shown that the two estimators
$\theta$ and $\theta'$ are also equivalent. One way to do this is to pick a basis which spans the full $N_{\rm ch}$
dimensional space and show that for each basis vector $b$, $a \cdot b = a' \cdot b$. As a basis consider
the $N_{\rm fg}$ vectors $F^j$ ($j > 0$) together with the $N_{\rm ch} - N_{\rm fg}$ unit vectors $z^{(r)}$. It is easy to check
that $a \cdot F^j = a' \cdot F^j = 0$ for all $j > 0$. Of course this is the way the estimators were constructed,
to be independent of foregrounds. Now I will show that $a \cdot z^{(r)} = a' \cdot z^{(r)}$ for all $r$.



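$$
a \cdot z^{(r)} = (K^{-1})_{00}\, F^0 \cdot z^{(r)}; \qquad a' \cdot z^{(r)} = \frac{F^0 \cdot z^{(r)}}{F^0_\perp \cdot F^0_\perp}. \tag{46}
$$

Thus to show that the two estimators are identical, I need only show that

$$
\frac{1}{F^0_\perp \cdot F^0_\perp} = (K^{-1})_{00}. \tag{47}
$$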
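
To prove equation (47) let us calculate

$$
a \cdot a = \sum_{jj'} K^{-1}_{0j} K^{-1}_{0j'} F^j \cdot F^{j'} = (K^{-1})_{00} \tag{48}
$$

since $F^j \cdot F^{j'} = K_{jj'}$. Since $a$ is perpendicular to all the foregrounds it lies in $\mathcal{F}_\perp$. Thus
$a$ can be written as

$$
a = \sum_r z^{(r)} \left( a \cdot z^{(r)} \right). \tag{49}
$$

Therefore, another way of writing the dot product in equation (48) is

$$
a \cdot a = \sum_{rr'} \left( a \cdot z^{(r)} \right) \left( a \cdot z^{(r')} \right) z^{(r)} \cdot z^{(r')} = \sum_r \left( a \cdot z^{(r)} \right)^2 = \left[ (K^{-1})_{00} \right]^2 \sum_r \left( F^0 \cdot z^{(r)} \right)^2, \tag{50}
$$

where the last equality uses equation (46). Using equation (48) we can now equate

$$
\frac{1}{\sum_r \left( F^0 \cdot z^{(r)} \right)^2} = (K^{-1})_{00}. \tag{51}
$$

But the left hand side here is equal to the left hand side of equation (47) by definition. So
the identity is proven.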
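The equivalence proven above can also be checked numerically. The following sketch compares the best-fit estimator with the marginalization (projection) estimator for the CMB plus one foreground. It assumes an identity noise matrix so that the dot product is the ordinary one; the Gram-Schmidt construction of the basis $z^{(r)}$, the function names, and the input numbers are all choices made for this illustration.

```python
import math

def dot(u, v):
    """Ordinary dot product (identity noise matrix assumed)."""
    return sum(a * b for a, b in zip(u, v))

def complement_basis(foregrounds, n_ch, tol=1e-10):
    """Orthonormal basis z^(r) of the space orthogonal to the foreground shapes.

    Gram-Schmidt is run on the (linearly independent) foregrounds followed by
    the unit vectors; the vectors generated after the foregrounds span F_perp.
    """
    unit = [[1.0 if a == b else 0.0 for a in range(n_ch)] for b in range(n_ch)]
    basis = []
    for vec in list(foregrounds) + unit:
        w = list(vec)
        for e in basis:
            proj = dot(w, e)
            w = [wi - proj * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis[len(foregrounds):]

def best_fit(f0, f1, temps):
    """CMB amplitude from the fit, theta = sum_j (K^-1)_0j F^j . T."""
    k00, k01, k11 = dot(f0, f0), dot(f0, f1), dot(f1, f1)
    det = k00 * k11 - k01 ** 2
    return (k11 * dot(f0, temps) - k01 * dot(f1, temps)) / det

def marginalized(f0, f1, temps):
    """CMB amplitude from projecting out the foreground, as in equation (44)."""
    zs = complement_basis([f1], len(f0))
    num = sum(dot(f0, z) * dot(z, temps) for z in zs)
    den = sum(dot(f0, z) ** 2 for z in zs)
    return num / den

# Hypothetical four-channel data.
f0 = [1.0, 1.0, 1.0, 1.0]
f1 = [1.0, 0.5, 0.25, 0.125]
temps = [31.2, 29.1, 30.4, 28.7]
theta_fit = best_fit(f0, f1, temps)
theta_marg = marginalized(f0, f1, temps)
```

The two numbers agree to rounding error for any choice of `temps`, which is the content of the identity proven in this appendix.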